SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: 

(A) NAME: Hoechst Aktiengesellschaft

(B) STREET: - 

(C) CITY: Frankfurt

(D) STATE: -

(E) COUNTRY: Germany 

(F) POSTAL CODE (ZIP): 65926

(G) TELEPHONE: 069-305-7072
(H) TELEFAX: 069-35-7175

(I) TELEX: - 






(ii) TITLE OF INVENTION: Purification of higher order transcription 
complexes from transgenic non-human animals 

(iii) NUMBER OF SEQUENCES: 17 



(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: peptide 



(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..12



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

Met Gly Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Val 
1 5 10



(2) INFORMATION FOR SEQ ID NO: 2: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(ix) FEATURE:

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..11 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Gly Tyr Pro Tyr Asp Val Pro Asp Tyr Ala 
1 5 10






(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(ix) FEATURE:

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..10 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Gly Tyr Pro Tyr Asp Val Pro Asp Tyr Ala 
1 5 10






(2) INFORMATION FOR SEQ ID NO: 4: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide



(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Tyr Pro Tyr Asp Val Pro Asp Tyr Ala 
1 5 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon
(B) LOCATION: 1..22

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GGAGCAACCG CCTGCTGGGT GC 22

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: exon

(B) LOCATION: 1..21



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
CCTGTGTTGC CTGCTGGGAC G 21 

(2) INFORMATION FOR SEQ ID NO: 7: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear



Attorney Docket No. 026083/0173 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..21 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
GGAGACTGAA GTTAGGCCAG C 21 



(2) INFORMATION FOR SEQ ID NO: 8: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 76 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..76

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GCGGCACCAG GCCGCTGCTG TGATGATGAT GATGATGGCT GCTGCCCATG ACTGCGTAAT 60 
GCGGTCATGA CGCTTT 76 



(2) INFORMATION FOR SEQ ID NO: 9: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 75 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon
(B) LOCATION: 1..75

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
GAAGGGGGTG GGGGAGGCAA GGGTACATGA GAGCCATTAC GTCGTCTTCC TGAATCCCTT 60
TAGCCGCTTT GCTCG 75



(2) INFORMATION FOR SEQ ID NO: 10: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..22

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

CCCTATGACG TCCCGGATTA CG 22

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..22 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GTGGAGTGGT GCCCGGCAAG GG 22 

(2) INFORMATION FOR SEQ ID NO: 12: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(ix) FEATURE:

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..19 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

Arg Gly Cys 



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1310 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: exon

(B) LOCATION: 1..1310 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

CCATGGGCTA TCCCTATGAC GTCCCGGATT ACGCAGTCAT GGGCAGCAGC CATCATCATC 60 

ATCATCACAG CAGCGGCCTG GTGCCGCGCG GCAGCCATAT GGATCAGAAC AACAGCCTGC 120

CACCTTACGC TCAGGGCTTG GCCTCCCCTC AGGGTGCCAT GACTCCCGGA ATCCCTATCT 180 

TTAGTCCAAT GATGCCTTAT GGCACTGGAC TGACCCCACA GCCTATTCAG AACACCAATA 240

GTCTGTCTAT TTTGGAAGAG CAACAAAGGC AGCAGCAGCA ACAACAACAG CAGCAGCAGC 300

AGCAGCAGCA GCAGCAACAG CAACAGCAGC AGCAGCAGCA GCAGCAGCAG CAGCAGCAGC 360

AGCAGCAGCA GCAGCAGCAA CAGGCAGTGG CAGCTGCAGC CGTTCAGCAG TCAACGTCCC 420

AGCAGGCAAC ACAGGGAACC TCAGGCCAGG CACCACAGCT CTTCCACTCA CAGACTCTCA 480 

CAACTGCACC CTTGCCGGGC ACCACTCCAC TGTATCCCTC CCCCATGACT CCCATGACCC 540 


CCATCACTCC TGCCACGCCA GCTTCGGAGA GTTCTGGGAT TGTACCGCAG CTGCAAAATA 600 

TTGTATCCAC AGTGAATCTT GGTTGTAAAC TTGACCTAAA GACCATTGCA CTTCGTGCCC 660 

GAAACGCCGA ATATAATCCC AAGCGGTTTG CTGCGGTAAT CATGAGGATA AGAGAGCCAC 720

GAACCACGGC ACTGATTTTC AGTTCTGGGA AAATGGTGTG CACAGGAGCC AAGAGTGAAG 780

AACAGTCCAG ACTGGCAGCA AGAAAATATG CTAGAGTTGT ACAGAAGTTG GGTTTTCCAG 840

CTAAGTTCTT GGACTTCAAG ATTCAGAACA TGGTGGGGAG CTGTGATGTG AAGTTTCCTA 900 

TAAGGTTAGA AGGCCTTGTG CTCACCCACC AACAATTTAG TAGTTATGAG CCAGAGTTAT 960 

TTCCTGGTTT AATCTACAGA ATGATCAAAC CCAGAATTGT TCTCCTTATT TTTGTTTCTG 1020 

GAAAAGTTGT ATTAACAGGT GCTAAAGTCA GAGCAGAAAT TTATGAAGCA TTTGAAAACA 1080 

TCTACCCTAT TCTAAAGGGA TTCAGGAAGA CGACGTAATG GCTCTCATGT ACCCTTGCCT 1140 

CCCCCACCCC CTTCTTTTTT TTTTTTTAAA CAAATCAGTT TGTTTTGGTA CCTTTAAATG 1200

GTGGTGTTGT GAGAAGATGG ATGTTGAGTT GCAGGGTGTG GCACCAGGTG ATGCCCTTCT 1260 

GTAAGTGCCC CTTCCGGCAT CCCGGAATTC CTGCAGCCCA ACGCGGCCGC 1310 






(2) INFORMATION FOR SEQ ID NO: 14: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4286 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon
(B) LOCATION: 1..4286

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

GAATTCCCCT GCAGGTCACT TAGCGTTGGC CACATAGTAG GTTCTCAAAT ACTTGTTAAT 60

AAATAAGTTT GTTCGAGAAG CTGGGCAATG ATATTCTACA GCTGGAAGAA GAAACATAAT 120 

GATCTAGTAA TTAGCTCAAT TAAAAATAAA CGTTCTTCTT TCCTCAGAGG AGCATTTCCC 180 

AAGGCCTGCC TTGATAGCCA TCCAAAAAGG CCAAGCTCAT CCAATCTTGC CCTAGATTTA 240

TGCTAAAATG CAGTTACAAT CGATAGGATG ACAGAAAACG ACAGCACTTA TTTAAATATA 300

ATAGGCACTT ATTTAAATAG GAGAAGCTGT GACTTCATAG CAAGTGTTGG GGTTAGGAAA 360

CTGGGTGGAT AAACTTGCTG ATGCTGTAGA TCTTAGCCTC TACATGAGAT CATGTGGAAA 420

ATCTGAAAGC ATTTTAGGTT CCTTATGTTT GCAATCAAAT AACTGTACAC CTTTTAATTT 480



AAAAAGTACC ATGAGGCACA CACACACACT CGCAGGAACT TTTTGGCGTA ACAAAACTAG 540 

AATTAGATCT AAAAGCTAAC TGTAGGACTG AGTCTATTCT AAACTGAAAG CCTGGACATC 600

[illegible] ACGTGTTACG GGCTTCCATA AAAGCAGCTG GCTTTGAATG 660

[illegible] AGAGGCCAGC ACAGGAGCGG ATTCGTCGCT TTCACGGCCA TCGAGCCGAA 720

[illegible] CGTTAAGGAG GCCCCCAGTC CCGACCCTTC GCCCCAAGCC 780

[illegible] GTACTCCTTG CCACACGGGA GGGGCGCGGA AGCCGGGGCG 840

[illegible] CTGGGCTGAG ACCCGCAGAG GAAGACGCTC TAGGGATTTG 900

[illegible] AAGGCTGAGG ACGGGAGGCT GATTGAGAGG CGAAGGTACA 960

[illegible] AATACAACCT TTGGAGCTAA GCCAGCAATG GTAGAGGGAA GATTCTGCAC 1020

GGCGGCCTCC CCGTCACCAC [illegible] CCCCCCCAAC CCGCCCCGAC CGGAGCTGAG 1080

AGTAATTCAT ACAAAAGGAC TCGCCCCTGC CTTGGGGAAT CCCAGGGACC GTCGTTAAAC 1140

[illegible] GTAGAACCCA GAGATCGCTG CGTTCCCGCC CCCTCACCCG CCCGCTCTCG 1200

[illegible] GGTGGAGAAG AGCATGCGTG AGGCTCCGGT GCCCGTCAGT GGGCAGAGCG 1260

[illegible] ACAGTCCCCG AGAAGTTGGG GGGAGGGGTC GGCAATTGAA CCGGTGCCTA 1320

GAGAAGGTGG CGCGGGGTAA ACTGGGAAAG TGATGTCGTG TACTGGCTCC GCCTTTTTCC 1380

CGAGGGTGGG GGAGAACCGT ATATAAGTGC AGTAGTCGCC GTGAACGTTC TTTTTCGCAA 1440

CGGGTTTGCC GCCAGAACAC AGGTAAGTGC CGTGTGTGGT TCCCGCGGGC CTGGCCTCTT 1500

TACGGGTTAT GGCCCTTGCG TGCCTTGAAT TACTTCCACG CCCCTGGCTG CAGTACGTGA 1560

TTCTTGATCC CGAGCTTCGG GTTGGAAGTG GGTGGGAGAG TTCGAGGCCT TGCGCTTAAG 1620

[illegible] GCCTCGTGCT TGAGTTGAGG CCTGGCCTGG GCGCTGGGGC CGCCGCGTGC 1680

GAATCTGGTG GCACCTTCGC GCCTGTCTCG CTGCTTTCGA TAAGTCTCTA GCCATTTAAA 1740

[illegible] ACCTGCTGCG [illegible] TCTGGCAAGA TAGTCTTGTA AATGCGGGCC 1800

AAGATCTGCA CACTGGTATT [illegible] GGGCCGCGGG CGGCGACGGG GCCCGTGCGT 1860

[illegible] ATGTTCGGCG AGGCGGGGCC TGCGAGCGCG GCCACCGAGA ATCGGACGGG 1920

[illegible] CCTGCTCTGG TGCCTGGCCT CGCGCCGCCG TGTATCGCCC 1980

CGCCCTGGGC GGCAAGGCTG GCCCGGTCGG CACCAGTTGC GTGAGCGGAA AGATGGCCGC 2040

TTCCCGGCCC TGCTGCAGGG AGCTCAAAAT GGAGGACGCG GCGCTCGGGA GAGCGGGCGG 2100

GTGAGTCACC CACACAAAGG AAAAGGGCCT TTCCGTCCTC AGCCGTCGCT TCATGTGACT 2160

CCACGGAGTA CCGGGCGCCG TCCAGGCACC TCGATTAGTT CTCGAGCTTT TGGAGTACGT 2220

CGTCTTTAGG TTGGGGGGAG GGGTTTTATG CGATGGAGTT TCCCCACACT GAGTGGGTGG 2280

AGACTGAAGT TAGGCCAGCT TGGCACTTGA TGTAATTCTC CTTGGAATTT GCCCTTTTTG 2340

AGTTTGGATC TTGGTTCATT CTCAAGCCTC AGACAGTGGT TCAAAGTTTT TTTCTTCCAT 2400

TTCAGGTGTC GTGAGGAATT GCCCGGGGGA TCCATGGGCT ATCCCTATGA CGTCCCGGAT 2460

TACGCAGTCA TGGGCAGCAG CCATCATCAT CATCATCACA GCAGCGGCCT GGTGCCGCGC 2520

GGCAGCCATA TGGATCAGAA CAACAGCCTG CCACCTTACG CTCAGGGCTT GGCCTCCCCT 2580

CAGGGTGCCA TGACTCCCGG AATCCCTATC TTTAGTCCAA TGATGCCTTA TGGCACTGGA 2640

CTGACCCCAC AGCCTATTCA GAACACCAAT AGTCTGTCTA TTTTGGAAGA GCAACAAAGG 2700

CAGCAGCAGC AAGAACAACA GCAGCAGCAG CAGCAGCAGC AGCAGCAACA GCAACAGCAG 2760

CAGCAGCAGC AGCAGCAGCA GCAGCAGCAG CAGCAGCAGC AGCAGCAGCA ACAGGCAGTG 2820

GCAGCTGCAG CCGTTCAGCA GTCAACGTCC CAGCAGGCAA CACAGGGAAC CTCAGGCCAG 2880

GCACCACAGC TCTTCCACTC ACAGACTCTC ACAACTGCAC CCTTGCCGGG CACCACTCCA 2940

CTGTATCCCT CCCCCATGAC TCCCATGACC CCCATCACTC CTGCCACGCC AGCTTCGGAG 3000

AGTTCTGGGA TTGTACCGCA GCTGCAAAAT ATTGTATCCA CAGTGAATCT TGGTTGTAAA 3060

CTTGACCTAA AGACCATTGC ACTTCGTGCC CGAAACGCCG AATATAATCC CAAGCGGTTT 3120

GCTGCGGTAA TCATGAGGAT AAGAGAGCCA CGAACCACGG CACTGATTTT CAGTTCTGGG 3180

AAAATGGTGT GCACAGGAGC CAAGAGTGAA GAACAGTCCA GACTGGCAGC AAGAAAATAT 3240

GCTAGAGTTG TACAGAAGTT GGGTTTTCCA GCTAAGTTCT TGGACTTCAA GATTCAGAAC 3300

ATGGTGGGGA GCTGTGATGT GAAGTTTCCT ATAAGGTTAG AAGGCCTTGT GCTCACCCAC 3360

CAACAATTTA GTAGTTATGA GCCAGAGTTA TTTCCTGGTT TAATCTACAG AATGATCAAA 3420

CCCAGAATTG TTCTCCTTAT TTTTGTTTCT GGAAAAGTTG TATTAACAGG TGCTAAAGTC 3480

AGAGCAGAAA TTTATGAAGC ATTTGAAAAC ATCTACCCTA TTCTAAAGGG ATTCAGGAAG 3540

ACGACGTAAT GGCTCTCATG TACCCTTGCC TCCCCCACCC CCTTCTTTTT TTTTTTTTAA 3600

ACAAATCAGT TTGTTTTGGT ACCTTTAAAT GGTGGTGTTG TGAGAAGATG GATGTTGAGT 3660

TGCAGGGTGT GGCACCAGGT GATGCCCTTC TGTAAGTGCC CCTTCCGGCA TCCCGGATAT 3720

CCTGCAGCCC AACACGGCCG CTCGAGCATG CATCTAGAGA ACGTCACGGC CGCGATCCCC 3780

CTGTGCCTTC TAGTTGCCAG CCATCTGGTT GTTTGCCCCT CCCCCGTGCC TTCCTTGACC 3840

CTGGAAGGTG CCACTCCCAC TGTCCTTTCC TAATAAAATG AGGAAATTGC ATCGCATTGT 3900

CTGAGTAGGT GTCATTCTAT TCTGGGGGGT GGGGTGGGGC AGGACAGCAA GGGGGAGGAT 3960

TGGGAAGACA ATAGCAGGCA TGCTGGGGAT GCGGTGGGCT CTATGGGTAC CCAGGTGCTG 4020

AAGAATTGAC CCGGTTCCTC CTGGGCCAGA AAGAAGCAGG CACATCCCCT TCTCTGTGAC 4080

ACACCCTGTC CACGCCCCTG GTTCTTAGTT CCAGCCCCAC TCATAGGACA CTCAACTTGG 4140

AGCGGTCTCT CCCTCCCTCA TCAGCCCACC AAACCAAACC TAGCCTCCAA GAGTGGGAAG 4200

AAATTAAAGC AAGAAGGCTA TTAAGTGCAG AGGGAGAGAA AATGCCTCCA ACATGTGAGG 4260

AAGTAATGAT AGAAATCATA GAATTC 4286

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 3263 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon
(B) LOCATION: 1..3263

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

ATCGATAAGC TGAGATCCGG CTAGAAACTG CTGAGGGCTG GACCGCATCT GGGGACCATC 60

TGTTCTTGGC CCTGAGCGGG GCAGGAACTG CTTACCGCAG ATATCCTGTT TGCCCCAATT 120

CAGCTGTTCC ATCTGTTCTT GGCCCTGAGC GGGGCAGGAA CTGCTTACCA CAGATATCCT 180

GTTTGGCCCA TATTCAGCTG TCTCTCTGTT CCTGACCTTG ATCTGAACTT CTCTATTCTC 240

AGTTATGTAT TTTTCCCATG CCTTGCAAAA TGGCGTTACT TAAGCTAGCT TGCCAAACCT 300

ACGGCTGGGG TCTTTCACGT TTATATCTAT GAGGGGAAGG ACCCAGAGTG GGGAAGCTGG 360

GATCTTGGGA ACACGCTTCT CTACATGGCA TTGTCTGCAC GGTGGAGTCC GGATCTGAGC 420

TTGGCTTGGT TTTTAAAACC AGCCTGGAGT AGAGCAGATG GGTTAAGGTG AGTGACCCCT 480

CAGCCCTGGA CATTCTTAGA TGAGCCCCCT CAGGAGTAGA GAATAATGTT GAGATGAGTT 540

CTGTTGGCTA AAATAATCAA GGCTAGTCTT TATAAAACTG TCTCCTCTTC TCCTAGCTTC 600

GATCCAGAGA GAGACCTGGG CGGAGCTGGT CGCTGCTCAG GAACTCCAGG AAAGGAGAAG 660

CTGAGGTTAC CACGCTGCGA ATGGGTTTAC GGAGATAGCT GGCTTTCCGG GGTGAGTTCT 720

CGTAAACTCC AGAGCAGCGA TAGGCCGTAA TATCGGGGAA AGCACTATAG GGACATGATG 780

TTCCACACGT CACATGGGTC GTCCTATCCG AGCCAGTCGT GCCAAAGGGG CGGTCCCGCT 840

GTGCACACTG GCGCTCCAGG GAGCTCTGCA CTCCGCCCGA AAAGTGCGCT CGGCTCTGCC 900

AGGACGCGGG GCGCGTGACT ATGCGTGGGC TGGAGCAACC GCCTGCTGGG TGCAAACCCT 960

TTGCGCCCGG ACTCGTCCAA CGACTATAAA GAGGGCAGGC TGTCCTCTAA GCGTCACCAC 1020

GACTTCAACG TCCTGAGTAC CTTCTCCTCA CTTACTCCGT AGCTCCAGCT TCACCAGATC 1080

CTCGAGAACG TCTCCCATGG GCTATCCCTA TGACGTCCCG GATTACGCAG TCATGGGCAG 1140

CAGCCATCAT CATCATCATC ACAGCAGCGG CCTGGTGCCG CGCGGCAGCC ATATGGATCA 1200

GAACAACAGC CTGCCACCTT ACGCTCAGGG CTTGGCCTCC CCTCAGGGTG CCATGACTCC 1260

CGGAATCCCT ATCTTTAGTC CAATGATGCC TTATGGCACT GGACTGACCC CACAGCCTAT 1320

TCAGAACACC AATAGTCTGT CTATTTTGGA AGAGCAACAA AGGCAGCAGC AGCAACAACA 1380

ACAGCAGCAG CAGCAGCAGC AGCAGCAGCA ACAGCAACAG CAGCAGCAGC AGCAGCAGCA 1440

GCAGCAGCAG CAGCAGCAGC AGCAGCAGCA GCAACAGGCA GTGGCAGCTG CAGCCGTTCA 1500

GCAGTCAACG TCCCAGCAGG CAACACAGGG AACCTCAGGC CAGGCACCAC AGCTCTTCCA 1560

CTCACAGACT CTCACAACTG CACCCTTGCC GGGCACCACT CCACTGTATC CCTCCCCCAT 1620

GACTCCCATG ACCCCCATCA CTCCTGCCAC GCCAGCTTCG GAGAGTTCTG GGATTGTACC 1680

GCAGCTGCAA AATATTGTAT CCACAGTGAA TCTTGGTTGT AAACTTGACC TAAAGACCAT 1740

TGCACTTCGT GCCCGAAACG CCGAATATAA TCCCAAGCGG TTTGCTGCGG TAATCATGAG 1800

GATAAGAGAG CCACGAACCA CGGCACTGAT TTTCAGTTCT GGGAAAATGG TGTGCACAGG 1860

AGCCAAGAGT GAAGAACAGT CCAGACTGGC AGCAAGAAAA TATGCTAGAG TTGTACAGAA 1920

GTTGGGTTTT CCAGCTAAGT TCTTGGACTT CAAGATTCAG AACATGGTGG GGAGCTGTGA 1980

TGTGAAGTTT CCTATAAGGT TAGAAGGCCT TGTGCTCACC CACCAACAAT TTAGTAGTTA 2040

TGAGCCAGAG TTATTTCCTG GTTTAATCTA CAGAATGATC AAACCCAGAA TTGTTCTCCT 2100

TATTTTTGTT TCTGGAAAAG TTGTATTAAC AGGTGCTAAA GTCAGAGCAG AAATTTATGA 2160

AGCATTTGAA AACATCTACC CTATTCTAAA GGGATTCAGG AAGACGACGT AATGGCTCTC 2220

ATGTACCCTT GCCTCCCCCA CCCCCTTCTT TTTTTTTTTT TAAACAAATC AGTTTGTTTT 2280

GGTACCTTTA AATGGTGGTG TTGTGAGAAG ATGGATGTTG AGTTGCAGGG TGTGGCACCA 2340



GGTGATGCCC TTCTGTAAGT GCCCCTTCCG GCATCCCGGA ATTCCTGCAG CCCAACGCGG 2400

CCGCTTCGAG GGATCTTTGT GAAGGAACCT TACTTCTGTG GTGTGACATA ATTGGACAAA 2460

CTACCTACAG AGATTTAAAG CTCTAAGGTA AATATAAAAT TTTTAAGTGT ATAATGTGTT 2520

AAACTACTGA TTCTAATTGT TTGTGTATTT TAGATTCCAA CCTATGGAAC TGATGAATGG 2580

GAGCAGTGGT GGAATGCCTT TAATGAGGAA AACCTGTTTT GCTCAGAAGA AATGCCATCT 2640

AGTGATGATG AGGCTACTGC TGACTCTCAA CATTCTACTC CTCCAAAAAA GAAGAGAAAG 2700

GTAGAAGACC CCAAGGACTT TCCTTCAGAA TTGCTAAGTT TTTTGAGTCA TGCTGTGTTT 2760

AGTAATAGAA CTCTTGCTTG CTTTGCTATT TACACCACAA AGGAAAAAGC TGCACTGCTA 2820

TACAAGAAAA TTATGGAAAA ATATTCTGTA ACCTTTATAA GTAGGCATAA CAGTTATAAT 2880

CATAACATAC TGTTTTTTCT TACTCCACAC AGGCATAGAG TGTCTGCTAT TAATAACTAT 2940

GCTCAAAAAT TGTGTACCTT TAGCTTTTTA ATTTGTAAAG GGGTTAATAA GGAATATTTG 3000

ATGTATAGTG CCTTGACTAG AGATCATAAT CAGCCATACC ACATTTGTAG AGGTTTTACT 3060

TGCTTTAAAA AACCTCCCAC ACCTCCCCCT GAACCTGAAA CATAAAATGA ATGCAATTGT 3120

TGTTGTTAAC TTGTTTATTG CAGCTTATAA TGGTTACAAA TAAAGCAATA GCATCACAAA 3180

TTTCACAAAT AAAGCATTTT TTTCACTGCA TTCTAGTTGT GGTTTGTCCA AACTCATCAA 3240

TGTATCTTAT CATGTCTGGA TCC 3263

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 371 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(ix) FEATURE: 

(A) NAME/KEY: Protein 

(B) LOCATION: 1..371 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Gly Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Val Met Gly Ser Ser
1 5 10 15

His His His His His His Ser Ser Gly Leu Val Pro Arg Gly Ser His
20 25 30

Met Asp Gln Asn Asn Ser Leu Pro Pro Tyr Ala Gln Gly Leu Ala Ser
35 40 45

Pro Gln Gly Ala Met Thr Pro Gly Ile Pro Ile Phe Ser Pro Met Met
50 55 60

Pro Tyr Gly Thr Gly Leu Thr Pro Gln Pro Ile Gln Asn Thr Asn Ser
65 70 75 80

Leu Ser Ile Leu Glu Glu Gln Gln Arg Gln Gln Gln Gln Gln Gln Gln
85 90 95

Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln
100 105 110

Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Ala
115 120 125

Val Ala Ala Ala Ala Val Gln Gln Ser Thr Ser Gln Gln Ala Thr Gln
130 135 140

Gly Thr Ser Gly Gln Ala Pro Gln Leu Phe His Ser Gln Thr Leu Thr
145 150 155 160

Thr Ala Pro Leu Pro Gly Thr Thr Pro Leu Tyr Pro Ser Pro Met Thr
165 170 175

Pro Met Thr Pro Ile Thr Pro Ala Thr Pro Ala Ser Glu Ser Ser Gly
180 185 190

Ile Val Pro Gln Leu Gln Asn Ile Val Ser Thr Val Asn Leu Gly Cys
195 200 205

Lys Leu Asp Leu Lys Thr Ile Ala Leu Arg Ala Arg Asn Ala Glu Tyr
210 215 220

Asn Pro Lys Arg Phe Ala Ala Val Ile Met Arg Ile Arg Glu Pro Arg
225 230 235 240

Thr Thr Ala Leu Ile Phe Ser Ser Gly Lys Met Val Cys Thr Gly Ala
245 250 255

Lys Ser Glu Glu Gln Ser Arg Leu Ala Ala Arg Lys Tyr Ala Arg Val
260 265 270

Val Gln Lys Leu Gly Phe Pro Ala Lys Phe Leu Asp Phe Lys Ile Gln
275 280 285

Asn Met Val Gly Ser Cys Asp Val Lys Phe Pro Ile Arg Leu Glu Gly
290 295 300

Leu Val Leu Thr His Gln Gln Phe Ser Ser Tyr Glu Pro Glu Leu Phe
305 310 315 320

Pro Gly Leu Ile Tyr Arg Met Ile Lys Pro Arg Ile Val Leu Leu Ile
325 330 335

Phe Val Ser Gly Lys Val Val Leu Thr Gly Ala Lys Val Arg Ala Glu
340 345 350

Ile Tyr Glu Ala Phe Glu Asn Ile Tyr Pro Ile Leu Lys Gly Phe Arg
355 360 365

Lys Thr Thr
370


(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 



(ix) FEATURE:
(A) NAME/KEY: Protein

(B) LOCATION: 1..18 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

Arg Gly